CSVS Convert#



  • Thorough type guessing of CSV columns, so there is no need to configure types of each column. Scans whole file first to make sure all types in a column are consistent.

  • Quick conversions (uses rust underneath). Uses fast methods for each output format.

  • Tries to limit errors when inserting data into database by resorting to “text” if type guessing can’t determine a more specific type.

  • When inserting into existing databases automatically change schema of target to allow new data (evolve option).

  • Memory efficient. All csvs and outputs are streamed so all conversions should take up very little memory.

  • Gather stats and information about CSV files into datapacakge.json file and can use it for customizing conversion.


  • CSV files currently need header rows.

  • Whole file needs to be on disk as whole CSV is analyzed.

Conversion Docs#

Conversions for CSV files:

Full Option Reference

This is the python library, providing bindings to the rust library.

Contribute on github


pip install csvs_convert

Usage From CSV files.#

import csvs_convert

csvs_convert.csvs_to_sqlite("sqlite.db", ["file.csv"])
csvs_convert.csvs_to_postgres("postgresql://user:postgres@localhost/db", ["file.csv"])
csvs_convert.csvs_to_parquet("output", ["file.csv"])
csvs_convert.csvs_to_xlsx("output.xlsx", ["sqlite.db"])

Usage from datapackage#

A datapackage is a file that contains metadata about the tables its specification is described here.

To generate datapackage.json file you can use:

csvs_convert.csvs_to_datapackage('path/to/datapackage.json', ["fixtures/large/csv/data.csv"])

Also the other csvs_to_* commands return the datapackagage as a python dict eg:

    datapackage = csvs_convert.csvs_to_xlsx("output.xlsx", ["sqlite.db"])

You can use this file and alter it as needed. Mostly it is useful if you want to use the same schema across multiple files, as it will save time not having to do the type guessing for every file.

When referring to a datapackage you can either reference:

  • A datapackage.json file.

  • A datapackage directory containing a datapackage.json file. e.g. /a/datapackage/dir

  • A zip file containing a datapackage.json file. e.g. my_datapackage.zip


import csvs_convert

csvs_convert.datapackage_to_sqlite("sqlite.db", "path/to/datapackage.json")
csvs_convert.datapackage_to_postgres("postgresql://user:postgres@localhost/db", "path/to/datapackage.json")
csvs_convert.datapackage_to_parquet("path/to/directory", ["sqlite.db"])
csvs_convert.datapackage_to_xlsx("output.xlsx", "path/to/datapackage.json")